1.
International Journal of Advanced Computer Science and Applications ; 13(12):830-838, 2022.
Article in English | Web of Science | ID: covidwho-2308999

ABSTRACT

The number of social media users has increased. These users share and reshare their ideas in posts, and this information can be mined and used by decision-makers in different domains, who analyse and study user opinions on social media networks to improve the quality of products or study specific phenomena. During the COVID-19 pandemic, social media was used to make decisions to limit the spread of the disease using sentiment analysis. Substantial research on this topic has been done; however, there are limited Arabic textual resources on social media. This has resulted in fewer quality sentiment analyses on Arabic texts. This study proposes a model for Arabic sentiment analysis using a Twitter dataset and deep learning models with Arabic word embedding. It uses supervised deep learning algorithms on the proposed dataset. The dataset contains 51,000 tweets, of which 8,820 are classified as positive, 37,360 as neutral, and 8,820 as negative. After cleaning, it contains 31,413 tweets. The experiment was carried out by applying the deep learning models Convolutional Neural Network and Long Short-Term Memory and comparing the results with those of machine learning techniques such as Naive Bayes and Support Vector Machine. The accuracy of the AraBERT model is 0.92% when tested on 3,505 tweets.

2.
1st International Conference in Advanced Innovation on Smart City, ICAISC 2023 ; 2023.
Article in English | Scopus | ID: covidwho-2297802

ABSTRACT

Since the emergence of COVID-19 in December 2019, numerous news items about the pandemic have been shared on social media, containing information from both reliable and unreliable medical sources. News and misleading information spread quickly on social media, which can lead to anxiety, unwanted exposure to medical remedies, etc. Rapid detection of fake news can reduce its spread. In this paper, we aim to create an intelligent system to detect misleading information about COVID-19 using deep learning techniques based on LSTM and BLSTM architectures. The data used to construct the DL models are text and need to be transformed into numbers. In this paper, we test the efficiency of three vectorization techniques: Bag of Words, Word2Vec, and BERT. The experimental study showed that the best performance was given by the LSTM model with BERT, achieving an accuracy of 91% on the test set. © 2023 IEEE.

3.
Machine Learning and Knowledge Extraction ; 5(1):128-143, 2023.
Article in English | Scopus | ID: covidwho-2263722

ABSTRACT

Many changes in our digital corpus have been brought about by the interplay between rapid advances in digital communication and the current environment characterized by pandemics, political polarization, and social unrest. One such change is the pace with which new words enter the mass vocabulary and the frequency at which meanings, perceptions, and interpretations of existing expressions change. The current state-of-the-art algorithms do not allow for an intuitive and rigorous detection of these changes in word meanings over time. We propose a dynamic graph-theoretic approach to inferring the semantics of words and phrases ("terms") and detecting temporal shifts. Our approach represents each term as a stochastic time-evolving set of contextual words and is a count-based distributional semantic model in nature. We use local clustering techniques to assess the structural changes in a given word's contextual words. We demonstrate the efficacy of our method by investigating the changes in the semantics of the phrase "Chinavirus". We conclude that the term took on a much more pejorative meaning when the White House used the term in the second half of March 2020, although the effect appears to have been temporary. We make both the dataset and the code used to generate this paper's results available. © 2023 by the authors.

4.
Expert Systems with Applications ; 223, 2023.
Article in English | Scopus | ID: covidwho-2263399

ABSTRACT

Because of the frequent occurrence of chronic diseases, the COVID-19 pandemic, etc., online health expert question-answering (HQA) services have been unable to cope with the rapidly increasing demand for online consultations. Building a virtual health assistant based on medical named entity recognition (NER) can effectively assist with the consultation process, but the unstandardized expressions within HQA text pose a serious challenge for medical NER tasks. The main goal of this study is to propose a novel deep medical NER approach based on a collaborative decision strategy (CDS), i.e., co_decision_NER (CDN), that can identify standard and nonstandard medical entities in the HQA context. We collected 10,000 question–answer pairs from HaoDF, extracted medical entities from 15 entity categories, and used a CDS to fuse the advantages of different NER models. Ultimately, CDN achieved a performance (precision = 84.50%, recall = 84.30%, F1 = 84.40%) that was significantly better than that of the state-of-the-art (SOTA) method. Our empirical analysis suggests that the entity types Disease (DIS), Sign (SIG), Test (TES), Drug (DRU), Surgery (SUR), Precaution (PRE), and Region (REG) can be most easily expressed arbitrarily in the doctor–patient interaction scenario of HQA services. In addition, CDN can identify not only standard but also nonstandard medical entities, effectively alleviating the severe out-of-vocabulary (OOV) problem faced by HQA services when performing medical NER tasks. The core contribution of this study is the development of a novel neural network model fusion algorithm that can improve the performance of entity recognition in medical domain-specific tasks. © 2023 Elsevier Ltd
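The reported scores can be checked against the usual definition of F1 as the harmonic mean of precision and recall. The sketch below is illustrative only: the function name `f1_score` is an assumption made here and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (result is in the same units as the inputs)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
reported_f1 = f1_score(84.50, 84.30)
print(round(reported_f1, 2))
```

With the abstract's precision and recall, the result rounds to the reported F1 of 84.40%.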

5.
Race and Justice ; 13(1):55-79, 2023.
Article in English | Scopus | ID: covidwho-2241772

ABSTRACT

The current study attempts to compare anti-Asian discourse before and during the COVID-19 pandemic by analyzing big data on Quora, one of the most frequently used community-driven knowledge sites. We created two datasets regarding "Asians" and "anti-Asians" from Quora questions and answers between 2010 and 2021. A total of 1,477 questions and 5,346 answers were analyzed, and the datasets were divided into two time periods: before and during the COVID-19 pandemic. We conducted machine-learning-based topic modeling and deep-learning-based word embedding (Word2Vec). Before the pandemic, the topics of physical difference and racism were prevalent, whereas, after the pandemic, the topics of hate crime, the need to stop Asian hate crimes, and the need for the Asian solidarity movement emerged. Above all, the semantic similarity between Asian and Black people became closer, while the similarity between Asian people and other racial/ethnic groups diminished. The emergence of negative and radical language, which increased saliently after the outbreak of the pandemic, and the considerably wider semantic distance between Asian and White people indicate that the relationship between the two races has been weakened. The findings suggest a long-term campaign or education system to reduce racial tensions during the pandemic. © The Author(s) 2022.

6.
2022 Annual Modeling and Simulation Conference, ANNSIM 2022 ; 54:438-449, 2022.
Article in English | Scopus | ID: covidwho-2233800

ABSTRACT

This study aims to build clusters of similar research papers. Text clustering for research articles is challenging because re-clustering is necessary to handle newly added papers. An incremental clustering algorithm is presented to find similar research papers for COVID-19 related literature. The proposed approach uses an incremental word embedding generation technique to extract feature vectors of the papers. The initial clustering is done using the K-means algorithm with two NLP feature extraction models: TF-IDF and Word2vec. The clustering results show that Word2vec outperforms the TF-IDF model. With COVID-19 literature continuously growing, the ultimate focus is to add newly published papers to the existing clusters without re-clustering. The title, abstract, and full body of papers are considered for testing the proposed incremental algorithm. Clustering quality is evaluated by the Microsoft language similarity package, which shows that clustering of the full-text body outperforms clustering of the abstract and title of papers. © 2022 Society for Modeling & Simulation International (SCS)
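The idea of adding a new paper to existing clusters without re-clustering can be sketched as a nearest-cluster assignment. This is a minimal illustration and not the authors' algorithm: it uses raw term counts instead of TF-IDF or Word2vec features so that it stays self-contained, and the names `vectorize`, `cosine`, `assign_incrementally`, and the toy cluster labels are all assumptions.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_incrementally(clusters, text):
    """Add `text` to the most similar cluster and update that cluster in place.

    `clusters` maps a label to {"sum": Counter, "size": int}. Cosine similarity
    ignores scale, so the summed vector stands in for the centroid.
    """
    vec = vectorize(text)
    best = max(clusters, key=lambda label: cosine(vec, clusters[label]["sum"]))
    clusters[best]["sum"].update(vec)  # fold the new paper into the cluster
    clusters[best]["size"] += 1
    return best


# Hypothetical clusters produced by an earlier, one-off clustering run.
clusters = {
    "vaccines": {"sum": vectorize("vaccine trial efficacy vaccine dose"), "size": 1},
    "imaging": {"sum": vectorize("chest ct scan image segmentation"), "size": 1},
}
label = assign_incrementally(clusters, "booster dose vaccine efficacy")
print(label)
```

Only the chosen cluster is touched, which is what makes the update cheap compared with running K-means again over the whole collection.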

7.
International Journal of Advanced Computer Science and Applications ; 13(12), 2022.
Article in English | ProQuest Central | ID: covidwho-2226288

ABSTRACT

The number of social media users has increased. These users share and reshare their ideas in posts, and this information can be mined and used by decision-makers in different domains, who analyse and study user opinions on social media networks to improve the quality of products or study specific phenomena. During the COVID-19 pandemic, social media was used to make decisions to limit the spread of the disease using sentiment analysis. Substantial research on this topic has been done; however, there are limited Arabic textual resources on social media. This has resulted in fewer quality sentiment analyses on Arabic texts. This study proposes a model for Arabic sentiment analysis using a Twitter dataset and deep learning models with Arabic word embedding. It uses supervised deep learning algorithms on the proposed dataset. The dataset contains 51,000 tweets, of which 8,820 are classified as positive, 37,360 as neutral, and 8,820 as negative. After cleaning, it contains 31,413 tweets. The experiment was carried out by applying the deep learning models Convolutional Neural Network and Long Short-Term Memory and comparing the results with those of machine learning techniques such as Naive Bayes and Support Vector Machine. The accuracy of the AraBERT model is 0.92% when tested on 3,505 tweets.

8.
Digit Health ; 8: 20552076221145426, 2022.
Article in English | MEDLINE | ID: covidwho-2195658

ABSTRACT

Objective: The present study aims to examine the threshold of coronavirus disease 2019 (COVID-19) vaccine hesitancy over time and public discourse around COVID-19 vaccination hesitancy. Methods: We collected 3,952 questions and 66,820 answers regarding COVID-19 vaccination posted on the social question-and-answer website Quora between June 2020 and June 2021 and employed Word2Vec and Sentiment Analysis to analyze the data. To examine changes in the perceptions and hesitancy about the COVID-19 vaccine, we segmented the data into 25 bi-weekly sections. Results: As positive sentiment about vaccination increased, the number of new vaccinations in the United States also increased until it reached a ceiling point. The vaccine hesitancy phase was identified by the decrease in positive sentiment from its highest peak. Words that occurred only when the positive answer rate peaked (e.g., safe, plan, best, able, help) helped explain factors associated with positive perceptions toward vaccines, and the words that occurred only when the negative answer rate peaked (e.g., early, variant, scientists, mutations, effectiveness) suggested factors associated with vaccine hesitancy. We also identified a period of vaccine resistance, where people who decided not to be vaccinated were unlikely to be vaccinated without further enforcement or incentive. Conclusions: Findings suggest that vaccine hesitancy occurred because concerns about vaccine safety were high due to a perceived lack of scientific evidence and public trust in healthcare authorities has been seriously undermined. Considering that vaccine-related conspiracy theories and fake news prevailed in the absence of reliable information sources, restoring public trust in healthcare leaders will be critical for future vaccination efforts.

9.
7th IEEE International Conference on Information Technology and Digital Applications, ICITDA 2022 ; 2022.
Article in English | Scopus | ID: covidwho-2191874

ABSTRACT

In 2020, most Filipinos were using the internet due to COVID-19 pandemic lockdowns. The internet is not limited to adults, and children might be exposed to online adult content and abuse. The Philippine Internet Service providers fail to capture pornographic web pages that are not for child viewing. A web page classifier would help in detecting and classifying web pages. In this study, a total of 12000 web pages with adult content and academic web pages were collected using Scrapy, and existing datasets from DMOZ were used to create a Support Vector Machine (SVM) multi-class classifier. To improve the accuracy of the SVM model, data preprocessing was performed to remove noisy and irrelevant data from the dataset. The text in the web pages was used to train the SVM classifier by using Term Frequency and Inverse Document Frequency (TF-IDF), Count Vectorizer, and Word2vec Skip-gram embedding with TF-IDF as a multiplier. A series of experiments were conducted using multiple word embedding techniques. The SVM model built using word2vec with the TF-IDF multiplier outperforms the SVM models built using TF-IDF and Count Vectorizer. The word embedding generated using word2vec was generated with a window size of 9 and a vector dimension of 900. The SVM model built using word2vec shows an S6% accuracy. The SVM model is deployed in the Django framework, and a Chrome plugin was created to use the SVM model through a REST API. © 2022 IEEE.
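One common reading of "word2vec with TF-IDF as a multiplier" is that each word vector is scaled by the word's TF-IDF weight before the vectors are combined into a document vector. The sketch below follows that reading as an assumption, since the abstract does not give the exact formula. The two-dimensional embeddings and the function name `tfidf_weighted_vector` are hypothetical stand-ins for trained word2vec vectors.

```python
import math
from collections import Counter


def tfidf_weighted_vector(doc, corpus, embeddings):
    """Sum of word vectors, each scaled by the word's TF-IDF weight in `doc`."""
    n_docs = len(corpus)
    counts = Counter(doc)
    dim = len(next(iter(embeddings.values())))
    result = [0.0] * dim
    for word, count in counts.items():
        df = sum(1 for d in corpus if word in d)  # document frequency
        if word not in embeddings or df == 0:
            continue  # skip words without a vector or unseen in the corpus
        tf = count / len(doc)
        idf = math.log(n_docs / df)
        weight = tf * idf
        for i, x in enumerate(embeddings[word]):
            result[i] += weight * x
    return result


# Toy tokenized pages and toy 2-d embeddings (illustrative values only).
corpus = [["free", "lecture", "notes"], ["free", "adult", "video"]]
embeddings = {
    "free": [1.0, 0.0],
    "lecture": [0.0, 1.0],
    "notes": [0.0, 1.0],
    "adult": [1.0, 1.0],
    "video": [1.0, 1.0],
}
doc_vec = tfidf_weighted_vector(corpus[0], corpus, embeddings)
print(doc_vec)
```

A word such as "free" that appears in every page gets an IDF of zero, so it contributes nothing to the document vector; the resulting vectors would then be passed to the classifier as features.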

10.
International Journal on Technical and Physical Problems of Engineering ; 14(4):211-218, 2022.
Article in English | Scopus | ID: covidwho-2169877

ABSTRACT

This paper presents sentiment retrieval based on a natural language processing (NLP) Word2Vec method for health records in medical institutions in Iraq, especially for Covid-19 patients. Sentiment retrieval of medical records has gained significant attention worldwide as a way to understand the behaviors of both clinicians and patients. However, sentiment retrieval of medical notes still does not provide a clear picture of how information is retrieved from these summaries. The Covid-19 pandemic urges researchers in the field of medical records and AI modelling to establish a sentiment analysis based on discharge summary notes. The study is performed on 10000 medical notes from general hospitals covering a total of 8500 patients and 15000 medical notes from general clinics covering a total of 12000 patients. The study was conducted in Iraq from May 2021 to May 2022. The main intensity of measured sentiment is captured as positive or negative in the health records. The SentiWordNet platform is used to standardize a gold sentiment dataset, and the performance is evaluated using the Word2Vec method. Welch's t-test is used to validate the significance of the obtained results. It has been found that the statistical significance between Covid-19 health records reaches 94.6% with a p-value of 0.054. © 2022, International Organization on 'Technical and Physical Problems of Engineering'. All rights reserved.
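For readers unfamiliar with Welch's t-test, the statistic and its Welch-Satterthwaite degrees of freedom can be computed as in the sketch below. The sample values are made up for illustration and are not the study's data; the function name `welch_t` is an assumption.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance is the sample variance (divides by n - 1).
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Two made-up groups of sentiment scores with unequal variances.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(t_stat, dof)
```

Unlike Student's t-test, this version does not assume the two groups share a variance, which is why the degrees of freedom are generally not an integer.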

11.
4th International Conference on Futuristic Trends in Networks and Computing Technologies, FTNCT 2021 ; 936:749-763, 2022.
Article in English | Scopus | ID: covidwho-2148681

ABSTRACT

Each gender has its own characteristic behaviour, which can be reflected in every field of social media. During the COVID-19 pandemic, people used Twitter to discuss the issues caused by the disease. As Twitter does not disclose the gender of the user, in this study we discuss different kinds of approaches used to identify gender. From the literature review, it is found that dictionary-based approaches are the most suitable when working with sentiment analysis of unlabelled data. This study analyses ten kinds of emotions of males and females, from which we can observe how they reacted during the pandemic. The research proposes a dictionary-based approach to identify gender and then analyses sentiments using a cluster-based approach applied to word vectors after multiplying them by the sentence's polarity. The proposed approach is compared with existing approaches on different data sets and is found to achieve good accuracy in sentiment analysis of unlabelled gendered data. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

12.
5th International Conference on Applied Informatics, ICAI 2022 ; 1643 CCIS:121-133, 2022.
Article in English | Scopus | ID: covidwho-2148607

ABSTRACT

In this paper we investigate how scientific and medical papers about Covid-19 can be effectively mined. For this purpose we use the CORD19 dataset, which is a huge collection of all papers published about and around the SARS-CoV-2 virus and the pandemic it caused. We discuss how classical text mining algorithms like Latent Semantic Analysis (LSA) or its modern version Latent Dirichlet Allocation (LDA) can be used for this purpose, touch on more modern variants of these algorithms like word2vec, which came with the deep learning wave, and show the advantages and disadvantages of each. We finish the paper by showing some topic examples from the corpus and answering questions such as which topics are the most prominent in the corpus or what percentage of the corpus is dedicated to them. We also discuss how topics around RNA research in connection with Covid-19 can be examined. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.

13.
ICIC Express Letters, Part B: Applications ; 13(12):1331-1338, 2022.
Article in English | Scopus | ID: covidwho-2146166

ABSTRACT

Providing course recommendations has proven its importance, especially with the outbreak of COVID-19 and the resulting difficulties in the learning process. Although several works have proposed algorithms in this regard, to our knowledge, these algorithms did not consider the aptitudes of the student. These works used the previous grades of the student along with other features without considering the inherent student aptitudes, which would play a vital role in the final student grade and, in the long run, in shaping the student's future. This work proposes a novel course recommender based on student aptitudes. In order to provide recommendations, we propose a novel method to extract the inherent student aptitudes. These aptitudes are further enriched using a pre-trained deep learning model to include semantically correlated aptitudes. We adopt a nearest neighbor approach in the course recommender and consider the previously extracted student aptitudes. Experimental work proves the efficiency of the proposed methods in terms of accuracy. © 2022 ICIC International.

14.
10th IEEE Region 10 Humanitarian Technology Conference, R10-HTC 2022 ; 2022-September:288-293, 2022.
Article in English | Scopus | ID: covidwho-2136456

ABSTRACT

Internet adoption has increased rapidly during the worldwide COVID-19 pandemic. Nowadays people not only prefer to shop using various e-commerce platforms, but also like to provide feedback and express their opinions and experiences using the online platforms. Since new customers try to understand the products' utility and acceptability from other consumers' reviews, it has become crucial to analyze the customers' sentiments and opinions on each product. In this paper, we have presented a sentiment analysis technique on the basis of product reviews written in Bangla language to better understand the combined consumer perspective. Our work aims to compare existing classifiers' performance and find the best algorithm for our dataset. We collected reviews from the leading Bangla bookselling e-commerce site 'Rokomari.com' for this work. We implemented ML and DL classifier models and compared their overall performance on this dataset. The experimental studies show that the best accuracy is achieved from LSTM and SGD over the other implemented ML and DL based classifier models. © 2022 IEEE.

15.
Comput Struct Biotechnol J ; 20: 5564-5573, 2022.
Article in English | MEDLINE | ID: covidwho-2061048

ABSTRACT

Viral infections represent a major health concern worldwide. The alarming rate at which SARS-CoV-2 spreads, for example, led to a worldwide pandemic. Viruses incorporate genetic material into the host genome to hijack host cell functions such as the cell cycle and apoptosis. In these viral processes, protein-protein interactions (PPIs) play critical roles. Therefore, the identification of PPIs between humans and viruses is crucial for understanding the infection mechanism and host immune responses to viral infections and for discovering effective drugs. Experimental methods including mass spectrometry-based proteomics and yeast two-hybrid assays are widely used to identify human-virus PPIs, but these experimental methods are time-consuming, expensive, and laborious. To overcome this problem, we developed a novel computational predictor, named cross-attention PHV, by implementing two key technologies of the cross-attention mechanism and a one-dimensional convolutional neural network (1D-CNN). The cross-attention mechanisms were very effective in enhancing prediction and generalization abilities. Application of 1D-CNN to the word2vec-generated feature matrices reduced computational costs, thus extending the allowable length of protein sequences to 9000 amino acid residues. Cross-attention PHV outperformed existing state-of-the-art models using a benchmark dataset and accurately predicted PPIs for unknown viruses. Cross-attention PHV also predicted human-SARS-CoV-2 PPIs with area under the curve values >0.95. The Cross-attention PHV web server and source codes are freely available at https://kurata35.bio.kyutech.ac.jp/Cross-attention_PHV/ and https://github.com/kuratahiroyuki/Cross-Attention_PHV, respectively.

16.
2022 Annual Modeling and Simulation Conference, ANNSIM 2022 ; : 778-789, 2022.
Article in English | Scopus | ID: covidwho-2056832

ABSTRACT

This study aims to build clusters of similar research papers. Text clustering for research articles is challenging because re-clustering is necessary to handle newly added papers. An incremental clustering algorithm is presented to find similar research papers for COVID-19 related literature. The proposed approach uses an incremental word embedding generation technique to extract feature vectors of the papers. The initial clustering is done using the K-means algorithm with two NLP feature extraction models: TF-IDF and Word2vec. The clustering results show that Word2vec outperforms the TF-IDF model. With COVID-19 literature continuously growing, the ultimate focus is to add newly published papers to the existing clusters without re-clustering. The title, abstract, and full body of papers are considered for testing the proposed incremental algorithm. Clustering quality is evaluated by the Microsoft language similarity package, which shows that clustering of the full-text body outperforms clustering of the abstract and title of papers. © 2022 SCS.

17.
Lecture Notes in Electrical Engineering ; 888:459-466, 2022.
Article in English | Scopus | ID: covidwho-2035003

ABSTRACT

The field of unsupervised natural language processing (NLP) is gradually growing in prominence and popularity due to the overwhelming amount of scientific and medical data available as text, such as published journals and papers. To make use of this data, several techniques are used to extract information from these texts. In this paper, we make use of the COVID-19 corpus (https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge) related to the deadly coronavirus, SARS-CoV-2, to extract useful information which can be invaluable in finding a cure for the disease. We make use of two word-embedding models, Word2Vec and global vectors for word representation (GloVe), to efficiently encode all the information available in the corpus. We then follow some simple steps to find possible cures for the disease. We obtained useful results using these word-embedding models, and we also observed that the Word2Vec model performed better than the GloVe model on the dataset used. Another point highlighted by this work is that latent information about potential future discoveries is significantly contained in past papers and publications. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

18.
2nd IEEE International Conference on Intelligent Technologies, CONIT 2022 ; 2022.
Article in English | Scopus | ID: covidwho-2029211

ABSTRACT

Sentiment analysis is a process of extracting opinions into positive, negative, or neutral categories from a pool of text using Natural Language Processing (NLP). In the recent era, our society is swiftly moving towards virtual platforms by joining virtual communities. Social media such as Facebook, Twitter, WhatsApp, etc., are playing a vital role in developing virtual communities. A pandemic situation like COVID-19 accelerated people's involvement in social sites to express their concerns or views regarding crucial issues. Mining public sentiment from these social sites, especially from Twitter, will help various organizations to understand people's thoughts about the COVID-19 pandemic and to take necessary steps as well. The main objective of our study is to analyze public sentiment from COVID-19 tweets. We proposed a deep learning architecture based on the Bidirectional Gated Recurrent Unit (BiGRU) to accomplish our objective. We developed two different corpora from unlabelled and labeled COVID-19 tweets and used the unlabelled corpus to build an improved labeled corpus. Our proposed architecture achieves a better accuracy of 87% on the improved labeled corpus for mining public sentiment from COVID-19 tweets. © 2022 IEEE.

19.
J Comput Biol ; 29(9): 1001-1021, 2022 09.
Article in English | MEDLINE | ID: covidwho-2017640

ABSTRACT

The comparison of DNA sequences is of great significance in genomics analysis. Although the traditional multiple sequence alignment (MSA) method is popularly used for evolutionary analysis, optimally aligning k sequences becomes computationally intractable as k increases, due to the intrinsic computational complexity of MSA. Despite numerous k-mer alignment-free methods being proposed, the existing k-mer alignment-free methods may not truly capture the contextual structures of the sequences. In this study, we present a novel k-mer contextual alignment-free method (called kmer2vec), in which the sequence k-mers are semantically embedded as word2vec vectors, an essential technique in natural language processing. Consequently, the method converts each DNA/RNA sequence into a point in the high-dimensional word2vec space and compares DNA sequences in that space. Because the word2vec vectors are trained from the contextual relationship of k-mers in the genomes, the method may extract valuable structural information from the sequences and reflect the relationship among them properly. The proposed method is optimized on the parameters from word2vec training and verified in the phylogenetic analysis of large whole genomes, including coronavirus and bacterial genomes. The results demonstrate the effectiveness of the method in phylogenetic tree construction and species clustering. The method runs much faster than the MSA method, and the phylogenetic relationships constructed by the kmer2vec method are more accurate than those of the conventional k-mer alignment-free method. Therefore, this approach can provide new perspectives for phylogeny and evolution and make it possible to analyze large genomes. In addition, we discuss special parameterization in the k-mer word2vec embedding construction. An effective tool for rapid SARS-CoV-2 typing can also be derived by combining kmer2vec with clustering methods.
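Treating k-mers as "words" starts with turning a sequence into a list of overlapping k-mers, which can then be fed to a word2vec trainer as a sentence. The sketch below shows only that tokenization step under the assumption of a stride of 1; the function name `kmer_sentence` is hypothetical, and the paper's actual preprocessing may differ.

```python
def kmer_sentence(sequence: str, k: int) -> list:
    """Split a nucleotide sequence into overlapping k-mers (stride 1)."""
    if k <= 0:
        raise ValueError("k must be positive")
    sequence = sequence.upper()
    # A sequence shorter than k yields an empty list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


tokens = kmer_sentence("ATGCGA", 3)
print(tokens)  # ['ATG', 'TGC', 'GCG', 'CGA']
```

Because neighbouring k-mers overlap, each one appears in the context window of the next, which is the contextual relationship the embedding is trained on.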

Subject(s)
Algorithms, COVID-19, Base Sequence, Humans, Phylogeny, SARS-CoV-2/genetics, Sequence Analysis, DNA/methods

20.
7th IEEE International conference for Convergence in Technology, I2CT 2022 ; 2022.
Article in English | Scopus | ID: covidwho-1992606

ABSTRACT

During pandemics such as COVID-19, government announcements were sources to convey accurate and relevant information to the public in times of outbreak. Prior studies attempted to explore the public awareness and behavioral changes from various research disciplines in response to the COVID-19 pandemic. Literature has pointed out that the appropriate use of information sources significantly relates to public attitudes in battling the pandemic. Social media has been the widely used medium to express public interests in current events. Literature shows that social media use during a crisis effectively coordinates relevant information from different sources and promotes situational awareness. Therefore, it is crucial to investigate scalable approaches to promptly gather insights into the public's interests and how governments responded to the interests relevant to the COVID-19 pandemic. However, there is little empirical research found that tackles these needs. Therefore, we aim to close the research gap by examining the feasible approaches for (1) identifying if public information-seeking has similar patterns as information-sharing on social media during the COVID-19 pandemic, and (2) comparing the patterns with the government announcements to confirm if the announcements show aligned response to the public information-seeking and sharing during the COVID-19 pandemic. We applied text processing, LDA topic modeling, and Word Mover Distance techniques to realize our aim through a Malaysian case study. Our research work contributes to the application of the LDA-Word2Vec-Word Mover Distance architecture and algorithms that can be used for future investigation and comparison of information seeking and sharing patterns in different research subjects. © 2022 IEEE.
